Information Seeking Spoken Dialogue Systems – Part II: Multimodal Dialogue

Authors

  • Alexandros Potamianos
  • Eric Fosler-Lussier
  • Egbert Ammicht
  • Manolis Perakakis
Abstract

In this paper, the task and user interface modules of a multimodal dialogue system development platform are presented. The main goal of this work is to provide a simple, application-independent solution to the problem of multimodal dialogue design for information-seeking applications. The proposed system architecture clearly separates the task and interface components of the system. A task manager is designed and implemented that consists of two main sub-modules: the electronic form module, which handles the list of attributes that have to be instantiated by the user, and the agenda module, which contains the sequence of user and system tasks. Both the electronic forms and the agenda can be dynamically updated by the user. Next, a spoken dialogue module is designed that implements the speech interface for the task manager. The dialogue manager can handle complex error correction and clarification user input, building on the semantic and pragmatic modules presented in [4]. The spoken dialogue system is evaluated on a travel reservation task of the DARPA Communicator research program and is shown to yield over 90% task completion and good performance on both objective and subjective evaluation metrics. Finally, a multimodal dialogue system that combines graphical and speech interfaces is designed, implemented and evaluated. Only minor modifications to the unimodal semantic and pragmatic modules were required to build the multimodal system. It is shown that the multimodal system significantly outperforms the unimodal speech-only system, both in terms of efficiency (task success and time to completion) and user satisfaction, for a travel reservation task.

EDICS Category: 3-DIAL 3-MODA 3-INTF

Alexandros Potamianos is with the Dept. of Electronics & Computer Engineering, Technical Univ. of Crete, Chania 73100, Greece; email: [email protected]; tel: +30-28210-37221; fax: +30-28210-37542. Eric Fosler-Lussier is with the Dept. of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, USA; email: [email protected]; tel: +1-614-292-4890; fax: +1-614-292-2911. Egbert Ammicht is with Bell Labs, Lucent Technologies, Whippany, NJ 07981, USA; email: [email protected]; tel: +1-973-386-7139. Manolis Perakakis is with the Dept. of Electronics & Computer Engineering, Technical Univ. of Crete, Chania 73100, Greece; email: [email protected]; tel: +30-28210-37368; fax: +30-28210-37542. Some of this work was performed while Alexandros Potamianos and Eric Fosler-Lussier were with Bell Labs, Lucent Technologies, Murray Hill, NJ 07974, USA.
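
The abstract describes a task manager built from two sub-modules, an electronic form holding the attributes the user must instantiate and an agenda holding the sequence of tasks, kept separate from the interface. The paper's implementation is not reproduced on this page, so the following is only a minimal sketch of that separation. The class names, method names, and the travel attributes (origin, destination, date, city) are hypothetical and chosen purely for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ElectronicForm:
    """Hypothetical e-form: attribute name -> value (None = not yet given)."""
    name: str
    values: dict = field(default_factory=dict)

    def missing(self):
        # Attributes still to be instantiated, in insertion order.
        return [a for a, v in self.values.items() if v is None]

    def fill(self, attribute, value):
        # Dynamic update: an attribute not yet in the form is simply added.
        self.values[attribute] = value


class TaskManager:
    """Hypothetical task manager: knows forms and agenda, not the modality."""

    def __init__(self, forms, agenda):
        self.forms = {f.name: f for f in forms}
        self.agenda = deque(agenda)  # ordered (task, form name) pairs

    def add_task(self, task, form_name, front=False):
        # Dynamic agenda update, e.g. triggered by a user request.
        if front:
            self.agenda.appendleft((task, form_name))
        else:
            self.agenda.append((task, form_name))

    def next_action(self):
        """Return (task, form name, attribute) to request next, or None."""
        while self.agenda:
            task, form_name = self.agenda[0]
            missing = self.forms[form_name].missing()
            if missing:
                return task, form_name, missing[0]
            self.agenda.popleft()  # form is complete, move to the next task
        return None


# Illustrative travel-reservation setup (attribute names are assumptions).
flight = ElectronicForm("flight", {"origin": None, "destination": None, "date": None})
hotel = ElectronicForm("hotel", {"city": None})
manager = TaskManager([flight, hotel], [("book_flight", "flight"), ("book_hotel", "hotel")])
```

In this sketch a speech front end and a graphical front end would both call `fill` and `next_action` on the same manager, which is one simple way to keep task logic independent of the interface.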

Similar resources

A Testbed for Evaluating Multimodal Dialogue Systems for Small Screen Devices

This paper discusses the requirements for developing a multimodal spoken dialogue system for mobile phone applications. Since visual output as part of the multimodal system is limited by the restricted screen size of mobile phones, research in the field of information visualisation for small screen devices is discussed, and combinations of these techniques with spoken output are sketched. ...

Dialogue Modeling for Speech Generation in Multimodal Information Systems

Conversational approaches to cooperative human-computer interaction have mostly been developed for natural language interfaces. Recently we observe an increasing number of conversational approaches to multimodal interfaces as well (see, e.g., Maybury, Carbonell). They claim that linguistic contributions to the dialogue (written or spoken) as well as graphical operations are to be interpreted as commun...

Modeling and Guiding Cooperative Multimodal Dialogues

In this paper we claim that a consistent conversational approach to human-computer interaction can be applied feasibly to multimodal interaction. A comprehensive conversational model is presented that covers interrelated levels of the dialogue structure, i.e. illocutionary, rhetorical, and topical aspects. It thus provides the basis for a consistent interpretation of the linguistic as well as g...

Developing Multimodal Spoken Dialogue Systems: Empirical Studies of Spoken Human-Computer Interaction

This paper describes the data collected in Wizard of Oz experiments in a spoken dialogue system, WAXHOLM, that provides information on boat traffic in the Stockholm archipelago. The data consist of utterance-length speech files, their corresponding transcriptions, and log files of the dialogue sessions. Apart from the spontaneous dialogue speech, the speech material also comp...

Information Seeking Spoken Dialogue Systems – Part I: Semantics and Pragmatics

In this paper, the semantic and pragmatic modules of a spoken dialogue system development platform are presented and evaluated. The main goal of this research is to create spoken dialogue system modules that are portable across application domains and interaction modalities. We propose a hierarchical semantic representation that encodes all information supplied by the user over multiple dialog...

Journal:
  • IEEE Trans. Multimedia

Volume: 9

Publication date: 2007